CORRECTED APPEAL BRIEF

Mail Stop Appeal Brief - Patents
Commissioner for Patents
P.O. Box 1450
Alexandria, VA 22313-1450

Sir:

Applicants hereby submit this Corrected Appeal Brief in response to the
Notification of Non-Compliant Appeal Brief dated April [illegible], 2008. The original
Appeal Brief was submitted on March 18, 2008 to appeal the final rejection, dated
December 18, 2007, of claims 1 through 35 of the above-identified patent application.



REAL PARTY IN INTEREST
The present application is assigned to IBM Corporation, as evidenced by
an assignment recorded on September 2[illegible], 1999 in the United States Patent and
Trademark Office at Reel [illegible], Frame 0023. The assignee, IBM Corporation, is the
real party in interest.

STATUS OF CLAIMS
Claims 1 through 35 are pending in the above-identified patent
application. Claims [illegible], 10-14, 16-19, 21-26 and 28-35 remain rejected under 35
U.S.C. §102(b) as being anticipated by Chen et al., "Speaker, Environment and Channel
Change Detection and Clustering via the Bayesian Information Criterion," Proc. of the
DARPA Broadcast News Workshop (Feb. 1998), hereinafter referred to as "Chen." In
addition, claims [illegible], 20 and 27 remain rejected under 35 U.S.C. §103(a) as being
unpatentable over Chen in view of well known prior art, and claim 15 remains rejected
under 35 U.S.C. §103(a) as being unpatentable over Chen in view of Kleider et al.
(United States Patent No. 5,930,748). Claims 1, 16, 23 and 30-35 are being appealed.



STATUS OF AMENDMENTS
There have been no amendments filed subsequent to the final rejection.

SUMMARY OF CLAIMED SUBJECT MATTER
Independent claim 1 requires a method for tracking a speaker in an audio
source (page [illegible], lines 12-26, and page 8, lines 16-19), said method comprising the
steps of: identifying potential segment boundaries in said audio source (page 7, lines
10-24; FIG. 1: 300 and FIG. 3: 300); and clustering homogeneous segments from said
audio source (page 8, lines 1-6) substantially concurrently with said identifying step
(page 2, lines 16-17; FIG. 1: 200 and FIG. 2: 200).

Independent claim 16 requires a method for tracking a speaker in an audio
source (page [illegible], lines 12-26, and page 8, lines 16-19), said method comprising the
steps of: identifying potential segment boundaries in said audio source (page 7, lines
10-24; FIG. 1: 300 and FIG. 3: 300); and clustering segments from said audio source
(page 8, lines 1-6) corresponding to the same speaker substantially concurrently with
said identifying step (page 2, lines 16-17, and page 8, lines 16-18; FIG. 1: 200 and
FIG. 2: 200).

Independent claim 23 requires a method for tracking a speaker in an audio
source (page [illegible], lines 12-26, and page 8, lines 16-19), said method comprising the
steps of: identifying potential segment boundaries during a pass through said audio
source (page 7, lines 10-24; FIG. 1: 300 and FIG. 3: 300); and clustering segments from
said audio source (page 8, lines 1-6) corresponding to the same speaker during said same
pass through said audio source (page 2, lines 16-17; FIG. 1: 200 and FIG. 2: 200).

Independent claim 30 requires a system (FIG. 1: 100) for tracking a
speaker in an audio source (page [illegible], lines 12-26, and page 8, lines 16-19),
comprising: a memory (FIG. 1: 120) that stores computer-readable code; and a processor
(FIG. 1: 110) operatively coupled to said memory, said processor configured to implement
said computer-readable code, said computer-readable code configured to: identify
potential segment boundaries in said audio source (page 7, lines 10-24; FIG. 1: 300 and
FIG. 3: 300); and cluster homogeneous segments from said audio source (page 8, lines
1-6) substantially concurrently with said identification of segment boundaries (page 2,
lines 16-17; FIG. 1: 200 and FIG. 2: 200).

Independent claim 31 requires an article of manufacture comprising a
computer readable medium having computer readable code means embodied thereon,
said computer readable program code means comprising: a step to identify potential
segment boundaries in said audio source (page 7, lines 10-24; FIG. 1: 300 and FIG. 3:
300); and a step to cluster homogeneous segments from said audio source (page 8, lines
1-6) substantially concurrently with said identification of segment boundaries (page 2,
lines 16-17; FIG. 1: 200 and FIG. 2: 200).

Independent claim 32 requires a system (FIG. 1: 100) for tracking a
speaker in an audio source (page [illegible], lines 12-26, and page 8, lines 16-19),
comprising: a memory (FIG. 1: 120) that stores computer-readable code; and a processor
(FIG. 1: 110) operatively coupled to said memory, said processor configured to implement
said computer-readable code, said computer-readable code configured to: identify
potential segment boundaries in said audio source (page 7, lines 10-24; FIG. 1: 300 and
FIG. 3: 300); and cluster segments from said audio source (page 8, lines 1-6)
corresponding to the same speaker substantially concurrently with said identification of
segment boundaries (page 2, lines 16-17, and page 8, lines 16-18; FIG. 1: 200 and
FIG. 2: 200).

Independent claim 33 requires an article of manufacture comprising a
computer readable medium having computer readable code means embodied thereon,
said computer readable program code means comprising: a step to identify potential
segment boundaries in said audio source (page 7, lines 10-24; FIG. 1: 300 and FIG. 3:
300); and a step to cluster segments from said audio source (page 8, lines 1-6)
corresponding to the same speaker substantially concurrently with said identification of
segment boundaries (page 2, lines 16-17, and page 8, lines 16-18; FIG. 1: 200 and
FIG. 2: 200).

Independent claim 34 requires a system (FIG. 1: 100) for tracking a
speaker in an audio source (page [illegible], lines 12-26, and page 8, lines 16-19),
comprising: a memory (FIG. 1: 120) that stores computer-readable code; and a processor
(FIG. 1: 110) operatively coupled to said memory, said processor configured to implement
said computer-readable code, said computer-readable code configured to: identify
potential segment boundaries during a pass through said audio source (page 7, lines
10-24; FIG. 1: 300 and FIG. 3: 300); and cluster segments from said audio source (page
8, lines 1-6) corresponding to the same speaker during said same pass through said audio
source (page 2, lines 16-17, and page 8, lines 16-18; FIG. 1: 200 and FIG. 2: 200).

Independent claim 35 requires an article of manufacture comprising a
computer readable medium having computer readable code means embodied thereon,
said computer readable program code means comprising: a step to identify potential
segment boundaries during a pass through said audio source (page 7, lines 10-24; FIG. 1:
300 and FIG. 3: 300); and a step to cluster segments from said audio source (page 8,
lines 1-6) corresponding to the same speaker during said same pass through said audio
source (page 2, lines 16-17; FIG. 1: 200 and FIG. 2: 200).



STATEMENT OF GROUNDS OF REJECTION TO BE REVIEWED ON APPEAL
Claims 1, 16, 23 and 30-35 are rejected under 35 U.S.C. §102(b) as being
anticipated by Chen et al.

ARGUMENT
Independent Claims 1, 16, 23, and 30-35
Independent claims 1, 16, 23 and 30-35 are rejected under 35 U.S.C.
§102(b) as being anticipated by Chen et al.

In the Office Action dated August 27, 2002, the Examiner asserted that
Chen discloses speaker, environment and channel change detection and clustering via the
Bayesian Information Criterion for segmenting the audio stream into homogeneous
regions according to speaker identity, environmental condition and channel condition, and
clustering speech segments into homogeneous clusters according to speaker identity,
environmental condition and channel (citing page 1, paragraph 2), which reads on the
claimed "method of tracking a speaker in an audio source, said method comprising the
steps of: identifying potential segment boundaries in said audio source; and clustering
homogeneous segments from said audio source substantially concurrently with said
identifying step."

In the Response to Office Action dated December 2[illegible], 2002, Appellants
submitted that while Chen discloses segmenting an audio stream into homogeneous
regions and clustering speech segments into homogeneous clusters, the audio stream is
first segmented and then clustered. Appellants noted, as further evidence that the
clustering in Chen is performed only after the audio stream has been segmented, that
Section 4.1 indicates that each segment is compared to all other segments before
clustering is finalized. In addition, Section 4.2, first paragraph, indicates that the data set
consists of an audio file that has been hand-segmented into [illegible]24 short segments.

In the Office Action dated March 7, 2003, the Examiner notes that the
prior art cites that "our segmentation algorithm can successfully detect acoustic changes"
(Chen, abstract) and that "we first examine whether our detected change points were
true" (Chen, Section [illegible], paragraph 3). The Examiner asserts that this suggests that
Chen not only employs its own segmenting mechanism, but is also capable of combining
segmentation with clustering "substantially concurrently."

The Examiner also asserts that Chen suggests that the clustering does not
need completely segmented data, such that a clustering process may be combined with a
segmenting process together substantially concurrently, since Chen discloses that "it is
also clear that our criterion can be applied to top-down methods" (Chen, Section 4.1,
paragraph 4).

The Examiner further asserts that a clustering step can be inserted in the
segmentation loop in Chen, Section 3.2, paragraph 1, and that Chen is capable of
combining segmentation and clustering since the segmentation and clustering algorithms
are based on the BIC algorithm and since equations (2), (3), and (8) have no limitation for
combining segmentation and clustering.

Appellants acknowledge that Chen employs its own segmenting
mechanism, but find no indication of or suggestion to perform segmentation and
clustering "substantially concurrently" in the cited text. Appellants note that the
Examiner asserts that Chen is capable of this but does not assert that Chen suggests or
discloses combining segmentation with clustering substantially concurrently.

Appellants also note that, in the top-down method, a hypothesis is made
regarding the number of clusters. Then, a test is made to determine if the number of
clusters hypothesized actually "fits" the data. Alternatively, in the bottom-up method, the
number of clusters is determined from the data. Thus, the capability to utilize a top-down
method does not suggest that segmentation is performed substantially concurrently with
the clustering process.

Regarding the final assertion made by the Examiner, Appellants also note
that whether or not Chen is capable of combining segmentation and clustering, there is
no disclosure or suggestion to do so.

Thus, Chen does not disclose or suggest a "method of tracking a speaker
in an audio source, said method comprising the steps of: identifying potential segment
boundaries in said audio source; and clustering homogeneous segments from said audio
source substantially concurrently with said identifying step," as required by independent
claims 1, 16, 30, 31, 32 and 33 of the present invention. Similarly, independent claims
23, 34 and 35 require that the segmentation and clustering are performed on the "same
pass through said audio source."

Response to Examiner's Answer dated December [illegible], 200[illegible]
In the Examiner's Answer dated December [illegible], 200[illegible], the Examiner states
that it is believed that the limitation "substantially concurrently" has no patentable
weight because the Applicant does not have any clear definition and/or description in the
claim or in the specification about this limitation and does not give any conditions to
apply this limitation. The Examiner also asserts that the prior art explicitly and/or
implicitly discloses all the limitations regarding claim 1, including the limitation of
"substantially concurrently," based on the interpretation of the claim language and the
understanding (of the) prior art teachings. In particular, the Examiner asserts that the
performance of the two steps (segmentation and clustering) may be associated with many
time related factors, including computing speed, sample rate, and total stream size.

The Examiner further asserts that the fact that the clustering in Chen is
performed only after the audio stream has been segmented and that each segment is
compared to all other segments before clustering is finalized is not relevant to claim 1
since claim 1 does not recite these limitations.

The Examiner also notes that one cannot show nonobviousness by
attacking references individually where the rejections are based on combinations of
references.

Regarding the Examiner's assertion that the limitation "substantially
concurrently" has no patentable weight, Appellants note that the word "substantially" has
a well known and well understood definition in claim language. Its meaning is
sufficiently clear in the teachings of the specification such that a person of ordinary skill
in the art would understand the limitation without the need to apply conditions.

Regarding the Examiner's assertion that the prior art explicitly and/or
implicitly discloses all the limitations regarding claim 1, including the limitation of
"substantially concurrently," based on the interpretation of the claim language and the
understanding (of the) prior art teachings, Appellants note that the broad interpretations
made by the Examiner are not consistent with the specification and are not consistent
with the interpretation of the specification that a person of ordinary skill in the art would
make. As disclosed on page 2 (lines [illegible]) of the original specification, "the present
invention concurrently segments an audio file and clusters the segments corresponding to
the same speaker." Thus, the term "substantially concurrent" is related to the parallel
execution of the segmentation and clustering steps. See also FIG. 2.

More specifically, Appellants note that these limitations are clearly
captured in claim 1, which recites the limitations of identifying potential segment
boundaries in said audio source and clustering homogeneous segments from said audio
source substantially concurrently with said identifying step. Claim 1 requires the
clustering of homogeneous segments substantially concurrently with said identifying
step. Chen, therefore, actually teaches away from the present invention by teaching that
the clustering is performed only after the audio stream has been segmented. Thus,
contrary to the Examiner's assertion, the limitations cited by the Examiner in reference to
Chen are clearly relevant to the consideration of claim 1.

Appellants also note that the references were not attacked individually, but
were reviewed to demonstrate that none of the references contain the cited limitation
required by the claims of the present invention and that, therefore, the prior art does not
pose a bar to patentability.

Response to Final Office Action of December 18, 2007
The Examiner asserts that the claimed "clustering homogeneous
segments... substantially concurrently with said identifying step (segmentation)" does not
exclude the situation that "the clustering is performed only after the audio stream has
been segmented."

Appellants note that the Examiner is including the case where the
clustering and segmenting are performed sequentially. This is contrary to the claim
requirement that the clustering and segmenting are performed "substantially
concurrently." The term "substantially concurrently" should be given patentable weight.
In the cited example, however, the clustering and segmentation are performed
sequentially; there is absolutely no degree to which the clustering and segmentation are
performed "substantially concurrently." Thus, the Examiner's interpretation of the cited
claims is not a reasonable interpretation.

Moreover, the loop illustrated in FIG. 2 demonstrates that the
segmentation and clustering are performed substantially concurrently, as segmentation
may be performed both before and after clustering.

The Examiner further asserts that "Chen's disclosure satisfies the claimed
limitation under at least this minimum condition/assumption, because Chen recites
'comparing two models, one models the data as two Gaussian(s); the other models the
data as just one Gaussian' to detect the changing point for segmentation (Chen, Section
3.1, page 4), such that at least two data groups (segments) are segmented (before) for
clustering speakers (Chen, Section 4, page [illegible])."
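
The two-model comparison quoted above can be illustrated with a short sketch. It is offered only as an aid to the reader and is not taken from Chen or from the specification. It assumes one-dimensional features, so that each covariance determinant reduces to a variance, and it assumes a penalty weight `lam`; the names `delta_bic` and `best_boundary` are hypothetical. The sign convention is chosen so that a negative value favors the two-Gaussian model, that is, a boundary.

```python
import math


def _variance(samples):
    """Population variance, floored to avoid log(0) on constant data."""
    mean = sum(samples) / len(samples)
    return max(sum((x - mean) ** 2 for x in samples) / len(samples), 1e-12)


def delta_bic(window, i, lam=1.0):
    """One-Gaussian BIC minus two-Gaussian BIC for a split at index i.

    Assumes 1-D features (d = 1), so each covariance determinant is a
    variance and the second model costs d + d(d+1)/2 = 2 extra parameters.
    """
    n = len(window)
    penalty = 0.5 * lam * 2 * math.log(n)
    return (-0.5 * n * math.log(_variance(window))
            + 0.5 * i * math.log(_variance(window[:i]))
            + 0.5 * (n - i) * math.log(_variance(window[i:]))
            + penalty)


def best_boundary(window, lam=1.0, margin=2):
    """Return the most likely boundary index, or None if no split is favored."""
    candidates = range(margin, len(window) - margin + 1)
    i = min(candidates, key=lambda k: delta_bic(window, k, lam))
    return i if delta_bic(window, i, lam) < 0 else None
```

In this sketch a window drawn from two clearly different sources yields a boundary at the change point, while a homogeneous window yields no boundary. Note that the test itself says nothing about when clustering occurs.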

Appellants note that Chen describes a Gaussian distribution with means
and variances. Chen assumes that the means of all the signals s1 through sk can be
computed. This is only feasible if all the data required to compute the mean is available.
For example, imagine there is a pipe conveying water from point A to point B. The
observer at point B does not know how water will come over. (This is analogous to a
radio signal or video stream.) In order to compute the mean volume of water emanating
from a pipe per unit time, all the water can be collected into a container, the time required
for the pipe to go dry can be measured, and the volume of water collected can be divided
by the time required to collect it. This, in effect, is Chen's approach. In one aspect of the
present invention, a running mean is used; that is, as and when the data arrives, its
statistics are computed. In terms of the cited example, the volume of water arriving every
second (or some fixed multiple) is measured and runtime statistics are maintained. The
result is therefore usable from the time the water starts emanating from the pipe.
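
The contrast drawn by the pipe example can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the specification: the class name `RunningStats` and the function `batch_mean` are hypothetical, and Welford's update is used here simply as one well-known way of maintaining a running mean and variance.

```python
class RunningStats:
    """Running mean and variance, updated one measurement at a time."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        # Welford's update: no earlier measurement needs to be stored.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        return self._m2 / self.n if self.n else 0.0


def batch_mean(samples):
    """The 'collect everything first' approach: needs the complete data."""
    samples = list(samples)
    return sum(samples) / len(samples)
```

The running statistics are usable after the first measurement arrives and agree with the batch result once the stream has ended; the batch computation cannot begin until the last measurement has been collected.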

In one embodiment of the present invention, the segments are
automatically computed, where each segment is a speaker turn in a conversation. By
gathering together all "similar" segments into a cluster, all of these speaker turns are
recognized to correspond to one individual. This is done when the person finishes
speaking his or her turn. In a roundtable of speakers, speech by the various speakers is
able to be segmented and clustered as and when it occurs, versus after the entire
roundtable has ended.
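
The single-pass behavior described above can be sketched as follows. The sketch is illustrative only and rests on loudly stated assumptions: the boundary test (a jump away from the running mean of the open segment) and the cluster test (closeness of means) are simple stand-ins chosen for brevity, not the BIC tests of the claims, and the names `track_speakers`, `_assign`, `jump` and `same` are hypothetical.

```python
def _assign(clusters, seg_mean, seg_count, same):
    """Attach a just-closed segment to a matching cluster, or open a new one."""
    for cid, (mean, n) in enumerate(clusters):
        if abs(seg_mean - mean) <= same:
            total = n + seg_count
            clusters[cid] = [(mean * n + seg_mean * seg_count) / total, total]
            return cid
    clusters.append([seg_mean, seg_count])
    return len(clusters) - 1


def track_speakers(stream, jump=5.0, same=2.0):
    """Yield (start, end, cluster_id) as soon as each segment closes.

    Stand-in tests (assumptions): a boundary is declared when a sample
    differs from the open segment's running mean by more than `jump`;
    a segment joins the first cluster whose mean lies within `same`.
    """
    clusters = []                  # [mean, sample_count] per cluster
    start, total, count = 0, 0.0, 0
    index = -1
    for index, x in enumerate(stream):
        if count and abs(x - total / count) > jump:
            # Segment closed: cluster it now, before reading further audio.
            yield start, index, _assign(clusters, total / count, count, same)
            start, total, count = index, 0.0, 0
        total += x
        count += 1
    if count:
        yield start, index + 1, _assign(clusters, total / count, count, same)
```

Because `track_speakers` is a generator, each speaker turn is labeled during the same pass that detects it, and the third turn in a sequence such as speaker A, speaker B, speaker A is recognized as belonging to the first cluster without waiting for the stream to end.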

Chen alternatively outlines two clustering approaches in Sections 4 and
4.1. In the clustering approach of Section 4, the audio is first broken up into segments
using the BIC criterion. The clustering begins after the entire audio has been broken into
segments. In the real world, when dealing with real-time video or audio streams, the
"entire" audio can be acquired, for example, only after one hits the "stop" button on the
recorder. After breaking the audio down into individual segments, Chen collects them
into clusters. The number of clusters is open to begin with, as is the cluster membership.
Chen combines audio segments in different combinations in order to arrive at a globally
optimized set of clusters as defined by the BIC criterion. As is stated in the first
paragraph of Section 4.1, the process is very computationally expensive.

In the second "greedy" clustering approach (Section 4.1), Chen's starting
point is the same as the above: it is a set of individual audio segments realized "after" the
entire audio stream has been broken up into segments. There is no simultaneous
segmentation and clustering. Clustering follows segmentation; only the clustering
algorithm is slightly optimized over the first technique. Chen states in the first sentence
of Section 4.1 that the process of clustering works by merging nearest nodes. Chen can
do this only after he has access to "all" the nodes (each node corresponds to a segment).
In line [illegible], paragraph 2 of Section 4.1, Chen teaches "let S = {s1, s2, ..., sk} be the
current set of nodes." Here, Chen tacitly states that there is access to all the segments
labeled s1 through sk, where k is the total number of segments (nodes), i.e., there is
access to all the audio that is desired to be analyzed. In any real-time application, such as
streaming audio or streaming video, there is access to all that has transpired thus far.
Thus, segmentation and clustering can only be done based on past events.
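
The batch character of the greedy approach, as characterized above, can be illustrated with the following sketch. It is not Chen's code. It assumes one-dimensional features, selects the "nearest" pair of nodes by distance between means as a simplification, and applies a one-dimensional BIC comparison as the merge test; the names `merge_gain` and `greedy_cluster` and the weight `lam` are hypothetical.

```python
import math
from itertools import combinations


def _mean(samples):
    return sum(samples) / len(samples)


def _variance(samples):
    """Population variance, floored to avoid log(0) on constant data."""
    m = _mean(samples)
    return max(sum((x - m) ** 2 for x in samples) / len(samples), 1e-12)


def merge_gain(a, b, lam=1.0):
    """BIC of one Gaussian for a+b minus BIC of two Gaussians (1-D).

    A positive value means the merged model is preferred.
    """
    both = a + b
    n = len(both)
    return (-0.5 * n * math.log(_variance(both))
            + 0.5 * len(a) * math.log(_variance(a))
            + 0.5 * len(b) * math.log(_variance(b))
            + lam * math.log(n))


def greedy_cluster(segments, lam=1.0):
    """Bottom-up merging; note that every segment must be supplied up front."""
    nodes = [list(s) for s in segments]
    while len(nodes) > 1:
        # Nearest pair of nodes (simplification: distance between means).
        i, j = min(combinations(range(len(nodes)), 2),
                   key=lambda p: abs(_mean(nodes[p[0]]) - _mean(nodes[p[1]])))
        if merge_gain(nodes[i], nodes[j], lam) <= 0:
            break
        nodes[i] = nodes[i] + nodes[j]
        del nodes[j]
    return nodes
```

The signature makes the point of the paragraph concrete: `greedy_cluster` takes the complete list of segments as its input, so it cannot start until segmentation of the entire audio has finished.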
Additional Cited References

Kleider et al. was also cited by the Examiner in rejecting claim 15 for its
disclosure that the information of the speaker model data includes a speaker name.
Appellants note that the inventors listed in United States Patent Number [illegible]57,763
(referred to by the Examiner in the final Office Action) are not Kleider et al. Appellants
did find, however, United States Patent Number 5,930,748 in the Notice of References
Cited and respond to that reference below.

Appellants note that Kleider et al. is directed to a "method of identifying
an individual from a predetermined set of individuals using a speech sample spoken by
the individual. The speech sample comprises a plurality of spoken utterances, and each
individual of the set has predetermined speaker model data." (Cited, Summary of the
Invention.) Kleider et al. do not address the issue of segmenting speech.

Thus, Kleider et al. do not disclose or suggest a "method of tracking a
speaker in an audio source, said method comprising the steps of: identifying potential
segment boundaries in said audio source; and clustering homogeneous segments from
said audio source substantially concurrently with said identifying step," as required by
independent claims 1, 16, 30, 31, 32 and 33 of the present invention. Similarly,
independent claims 23, 34 and 35 require that the segmentation and clustering are
performed on the "same pass" through said audio source.

Conclusion

The rejections of the cited claims under sections 102 and 103 in view of
Chen, Kleider et al. or well known prior art, alone or in any combination, are therefore
believed to be improper and should be withdrawn. The remaining rejected dependent
claims are believed allowable for at least the reasons identified above with respect to the
independent claims.

The attention of the Examiner and the Appeal Board to this matter is
appreciated.

Respectfully, 

Date: May 7, 2008                    Kevin M. Mason
Attorney for Applicant(s)
Reg. No. 36,597
Ryan, Mason & Lewis, LLP
1300 Post Road, Suite 205
Fairfield, CT 06824
(203) 255-6560

APPENDIX

1. A method for tracking a speaker in an audio source, said method
comprising the steps of:
identifying potential segment boundaries in said audio source; and
clustering homogeneous segments from said audio source substantially
concurrently with said identifying step.

2. The method of claim 1, wherein said identifying step identifies segment
boundaries using a BIC model-selection criterion.

3. The method of claim 2, wherein a first model assumes there is no
boundary in a portion of said audio source and a second model assumes there is a
boundary in said portion of said audio source.

4. The method of claim 2, wherein a given sample, i, in said audio source is
likely to be a segment boundary if the following expression is negative:

$$\Delta BIC_i = -\frac{n}{2}\log|\Sigma_w| + \frac{i}{2}\log|\Sigma_f| + \frac{n-i}{2}\log|\Sigma_s| + \frac{1}{2}\lambda\left(d + \frac{d(d+1)}{2}\right)\log n$$

where $|\Sigma_w|$ is the determinant of the covariance of the window of all n samples, $|\Sigma_f|$ is the
determinant of the covariance of the first subdivision of the window, and $|\Sigma_s|$ is the
determinant of the covariance of the second subdivision of the window.

5. The method of claim 1, wherein said identifying step considers a smaller
window size, n, of samples in areas where a segment boundary is unlikely to occur.

6. The method of claim 5, wherein said window size, n, is increased in a
relatively slow manner when the window size is small and increases in a faster manner
when the window size is larger.

7. The method of claim 5, wherein said window size, n, is initialized to a
minimum value after a segment boundary is detected.

8. The method of claim 2, wherein said BIC model selection test is not
performed at the border of each window of samples.

9. The method of claim [illegible], wherein said BIC model selection test is not
performed when the window size, n, exceeds a predefined threshold.

10. The method of claim [illegible], wherein said clustering step is performed using a
BIC model-selection criterion.

11. The method of claim 10, wherein a first model assumes that two segments
or clusters should be merged, and a second model assumes that said two segments or
clusters should be maintained independently.

12. The method of claim 11, further comprising the step of merging said two
clusters if a difference in BIC values for each of said models is positive.

13. The method of claim 1, wherein said clustering step is performed using K
previously identified clusters and M segments to be clustered.

14. The method of claim 1, further comprising the step of assigning a cluster
identifier to each of said clusters.

15. The method of claim 1, further comprising the step of processing said
audio source with a speaker identification engine to assign a speaker name to each of said
clusters.

16. A method for tracking a speaker in an audio source, said method
comprising the steps of:
identifying potential segment boundaries in said audio source; and
clustering segments from said audio source corresponding to the same
speaker substantially concurrently with said identifying step.

17. The method of claim 16, wherein said identifying step identifies segment
boundaries using a BIC model-selection criterion.

18. The method of claim 17, wherein a first model assumes there is no
boundary in a portion of said audio source and a second model assumes there is a
boundary in said portion of said audio source.

19. The method of claim 16, wherein said identifying step considers a smaller
window size, n, of samples in areas where a segment boundary is unlikely to occur.

20. The method of claim 17, wherein said BIC model selection test is not
performed where the detection of a boundary is unlikely to occur.

21. The method of claim 16, wherein said clustering step is performed using a
BIC model-selection criterion, where a first model assumes that two segments or clusters
should be merged, and a second model assumes that said two segments or clusters should
be maintained independently.

22. The method of claim 16, wherein said clustering step is performed using K
previously identified clusters and M segments to be clustered.

23. A method for tracking a speaker in an audio source, said method
comprising the steps of:
identifying potential segment boundaries during a pass through said audio
source; and
clustering segments from said audio source corresponding to the same
speaker during said same pass through said audio source.

24. The method of claim 23, wherein said identifying step identifies segment
boundaries using a BIC model-selection criterion.

25. The method of claim 24, wherein a first model assumes there is no
boundary in a portion of said audio source and a second model assumes there is a
boundary in said portion of said audio source.

26. The method of claim 23, wherein said identifying step considers a smaller
window size, n, of samples in areas where a segment boundary is unlikely to occur.

27. The method of claim 24, wherein said BIC model selection test is not
performed where the detection of a boundary is unlikely to occur.

28. The method of claim 23, wherein said clustering step is performed using a
BIC model-selection criterion, where a first model assumes that two segments or clusters
should be merged, and a second model assumes that said two segments or clusters should
be maintained independently.

29. The method of claim 23, wherein said clustering step is performed using K
previously identified clusters and M segments to be clustered.

30. A system for tracking a speaker in an audio source, comprising:
a memory that stores computer-readable code; and
a processor operatively coupled to said memory, said processor configured
to implement said computer-readable code, said computer-readable code configured to:
identify potential segment boundaries in said audio source; and
cluster homogeneous segments from said audio source substantially
concurrently with said identification of segment boundaries.

31. An article of manufacture, comprising:
a computer readable medium having computer readable code means
embodied thereon, said computer readable program code means comprising:
a step to identify potential segment boundaries in said audio source; and
a step to cluster homogeneous segments from said audio source
substantially concurrently with said identification of segment boundaries.

32. A system for tracking a speaker in an audio source, comprising:
a memory that stores computer-readable code; and
a processor operatively coupled to said memory, said processor configured
to implement said computer-readable code, said computer-readable code configured to:
identify potential segment boundaries in said audio source; and
cluster segments from said audio source corresponding to the same
speaker substantially concurrently with said identification of segment boundaries.

33. An article of manufacture, comprising:
a computer readable medium having computer readable code means
embodied thereon, said computer readable program code means comprising:
a step to identify potential segment boundaries in said audio source; and
a step to cluster segments from said audio source corresponding to the
same speaker substantially concurrently with said identification of segment boundaries.

34. A system for tracking a speaker in an audio source, comprising:
a memory that stores computer-readable code; and
a processor operatively coupled to said memory, said processor configured
to implement said computer-readable code, said computer-readable code configured to:
identify potential segment boundaries during a pass through said audio
source; and
cluster segments from said audio source corresponding to the same
speaker during said same pass through said audio source.

35. An article of manufacture, comprising:
a computer readable medium having computer readable code means
embodied thereon, said computer readable program code means comprising:
a step to identify potential segment boundaries during a pass through said
audio source; and
a step to cluster segments from said audio source corresponding to the
same speaker during said same pass through said audio source.

EVIDENCE APPENDIX
There is no evidence submitted pursuant to §§ 1.130, 1.131, or 1.132 or
entered by the Examiner and relied upon by appellant.

RELATED PROCEEDINGS APPENDIX
There are no known decisions rendered by a court or the Board in any
proceeding identified pursuant to paragraph (c)(1)(ii) of 37 CFR 41.37.